Visual Reasoning with Natural Language
Authors
Abstract
Natural language provides a widely accessible and expressive interface for robotic agents. To understand language in complex environments, agents must reason about the full range of language inputs and their correspondence to the world. For example, consider the scenario and instruction in Figure 1. To execute the instruction, the robot must identify the top shelf, recognize the two stacks as sets of items, compare items, and reason about the content and size of the sets. Such reasoning over language and vision is an open problem that is receiving increasing attention (Antol et al. 2015; Chen et al. 2015; Johnson et al. 2016). While existing data sets focus on visual diversity, they do not display the full range of natural language expressions, such as counting, set reasoning, and comparisons. We propose a simple natural language visual reasoning task, where the goal is to predict whether a descriptive statement paired with an image is true for the image. This abstract describes our existing synthetic image corpus (Suhr et al. 2017) and current work on collecting real vision data.
Similar resources
Visual common-sense for scene understanding using perception, semantic parsing and reasoning
In this paper we explore the use of visual commonsense knowledge and other kinds of knowledge (such as domain knowledge, background knowledge, linguistic knowledge) for scene understanding. In particular, we combine visual processing with techniques from natural language understanding (especially semantic parsing), common-sense reasoning and knowledge representation and reasoning to improve vis...
A Corpus of Natural Language for Visual Reasoning
We present a new visual reasoning language dataset, containing 92,244 pairs of examples of natural statements grounded in synthetic images with 3,962 unique sentences. We describe a method of crowdsourcing linguistically-diverse data, and present an analysis of our data. The data demonstrates a broad set of linguistic phenomena, requiring visual and set-theoretic reasoning. We experiment with v...
A RISC Approach to Reasoning with Natural Language
Many developers of natural language (NL) processing systems believe that a single internal representation (of sentences) should support all levels of reasoning (linguistic, semantic, and pragmatic). Such a representation is necessarily quite expressive, containing a large variety of syntactic constructs necessary to model the nuances of natural language. Others (including ourselves) take the vi...
Object-based reasoning in VQA
Visual Question Answering (VQA) is a novel problem domain where multi-modal inputs must be processed in order to solve a task given in the form of natural language. As the solutions inherently require combining visual and natural language processing with abstract reasoning, the problem is considered AI-complete. Recent advances indicate that using high-level, abstract facts extracted fr...
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-of-the-art systems attempted to solve the task using deep neural architectures and achieved promising performance. ...
Comparison of Moral Reasoning among Students with and without Visual Impairment
Background and Purpose: Some research has examined moral reasoning and judgment in students with special needs and has shown that these students lag behind their non-disabled counterparts in terms of moral development. Very few studies have been done on the development of moral reasoning in individuals with visual impairment; so given the research vacuum in this context, the ...
Journal title:
- CoRR
Volume abs/1710.00453
Publication date 2017